Smart Paper Notes

Online Learning in Supply-Chain Games

Nicolò Cesa-Bianchi, Tommaso Cesari, Takayuki Osogami, Marco Scarsini, Segev Wasserkrug

Categories: Machine Learning

2022-07-08

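We study a repeated game between a supplier and a retailer who want to maximize their respective profits without knowing the problem parameters. After characterizing the uniqueness of the Stackelberg equilibrium of the stage game with complete information, we show that, even with only partial knowledge of the joint distribution of demand and production costs, natural learning dynamics guarantee convergence of the joint strategy profile of the supplier and the retailer to the Stackelberg equilibrium of the stage game. We also prove finite-time bounds on the supplier's regret and asymptotic bounds on the retailer's regret, where the specific rates depend on the type of knowledge initially available to the players. In the special case in which the supplier is not strategic (vertical integration), we prove optimal bounds on the retailer's regret (or, equivalently, on the social welfare) when costs and demand are generated adversarially.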


Point Cloud-based Proactive Link Quality Prediction for Millimeter-wave Communications

Shoki Ohta, Takayuki Nishio, Riichi Kudo, Kahoko Takahashi, Hisashi Nagata

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2023-01-02

This study demonstrates the feasibility of point cloud-based proactive link quality prediction for millimeter-wave (mmWave) communications. Image-based methods have been proposed that use machine learning on time series of depth images to quantitatively and deterministically predict future received signal strength, mitigating line-of-sight (LOS) path blockage by human bodies in mmWave communications. However, image-based methods are applicable only in limited environments because camera images may contain private information. Thus, this study demonstrates the feasibility of using point clouds obtained from light detection and ranging (LiDAR) for mmWave link quality prediction. Point clouds represent three-dimensional (3D) spaces as sets of points and are sparser and less likely to contain sensitive information than camera images. Additionally, point clouds provide 3D position and motion information, which is necessary for understanding a radio propagation environment involving pedestrians. This study designs an mmWave link quality prediction method and conducts two experimental evaluations using different types of point clouds, obtained from LiDAR and depth cameras, and different numerical indicators of link quality: received signal strength and throughput. In these experiments, the proposed method predicts future large attenuation of mmWave link quality due to LOS blockage by human bodies; our point cloud-based method can therefore be an alternative to image-based methods.


SuperGF: Unifying Local and Global Features for Visual Localization

Wenzheng Song, Ran Yan, Boshu Lei, Takayuki Okatani

Categories: Computer Vision

2022-12-23

Advanced visual localization techniques, such as hierarchical localization, encompass both image retrieval and 6 Degree-of-Freedom (DoF) camera pose estimation. Thus, they must extract global and local features from input images. Previous methods have achieved this through resource-intensive or accuracy-reducing means, such as combinatorial pipelines or multi-task distillation. In this study, we present a novel method called SuperGF, which effectively unifies local and global features for visual localization, leading to a better trade-off between localization accuracy and computational efficiency. Specifically, SuperGF is a transformer-based aggregation model that operates directly on image-matching-specific local features and generates global features for retrieval. We conduct experimental evaluations of our method in terms of both accuracy and efficiency, demonstrating its advantages over other methods. We also provide implementations of SuperGF using various types of local features, including dense and sparse learning-based or hand-crafted descriptors.


Best-Answer Prediction in Q&A Sites Using User Information

Rafik Hadfi, Ahmed Moustafa, Kai Yoshino, Takayuki Ito

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-15

Community Question Answering (CQA) sites have spread and multiplied significantly in recent years. Sites like Reddit, Quora, and Stack Exchange are becoming popular amongst people interested in finding answers to diverse questions. One practical way of finding such answers is to automatically predict the best candidate given existing answers and comments. Many studies have been conducted on answer prediction in CQA, but with limited focus on using the background information of the questioners. We address this limitation using a novel method for predicting the best answers using the questioner's background information and other features, such as the textual content or the relationships with other participants. Our answer classification model was trained using the Stack Exchange dataset and validated using the Area Under the Curve (AUC) metric. The experimental results show that the proposed method complements previous methods by pointing out the importance of the relationships between users, particularly through the level of involvement in different communities on Stack Exchange. Furthermore, we point out that there is little overlap between user-relation information and the information represented by the shallow text features and the meta-features, such as time differences.
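
As a rough illustration of the AUC metric mentioned in this abstract, the sketch below computes a rank-based AUC in plain Python: the fraction of (best answer, other answer) pairs in which the best answer receives the higher score, with ties counted as one half. The function name `roc_auc` and the label convention (1 for a best answer, 0 otherwise) are assumptions made for this example; the abstract does not describe the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly.

    labels: 1 for a best answer, 0 otherwise (assumed convention).
    scores: classifier scores, higher meaning "more likely the best answer".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0   # positive ranked above negative
            elif p == n:
                correct += 0.5   # ties count as half a correct pair
    return correct / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy scores for four candidate answers, two of them labeled "best".
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3]))
```

The pairwise loop is quadratic in the number of answers, which keeps the definition visible; a sort-based computation would be the usual choice for large datasets.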


Interpretable Edge Enhancement and Suppression Learning for 3D Point Cloud Segmentation

Haoyi Xiu, Xin Liu, Weimin Wang, Kyoung-Sook Kim, Takayuki Shinohara, Qiong Chang, Masashi Matsuoka

Categories: Computer Vision

2022-09-20

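3D point clouds can flexibly represent continuous surfaces and can be used in a variety of applications; however, the lack of structural information makes point cloud recognition challenging. Recent edge-aware methods mainly use edge information as an extra feature that describes local structures to facilitate learning. Although these methods show that incorporating edges into the network design is beneficial, they generally lack interpretability, leaving users to wonder how exactly edges help. To shed light on this issue, in this study we propose the Diffusion Unit (DU), which handles edges in an interpretable manner while providing decent improvement. Our method is interpretable in three ways. First, we show theoretically that DU learns to perform task-beneficial edge enhancement and suppression. Second, we experimentally observe and verify the edge enhancement and suppression behavior. Third, we demonstrate empirically that this behavior contributes to performance improvement. Extensive experiments on challenging benchmarks verify the advantages of DU in terms of both interpretability and performance gain. Specifically, our method achieves state-of-the-art performance in object part segmentation using ShapeNet part and in scene segmentation using S3DIS. Our source code will be released at https://github.com/martianxiu/diffusionunit.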


Neural Architecture Search for Improving Latency-Accuracy Trade-off in Split Computing

Shoma Shimizu, Takayuki Nishio, Shota Saito, Yoichi Hirose, Chen Yen-Hsiu, Shinichi Shirakawa

Categories: Machine Learning

2022-08-30

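This paper proposes a neural architecture search (NAS) method for split computing. Split computing is an emerging machine-learning inference technique that addresses the privacy and latency challenges of deploying deep learning in IoT systems. In split computing, a neural network model is split and processed cooperatively by an edge server and an IoT device over a network. The architecture of the neural network model therefore significantly affects the communication payload size, the model accuracy, and the computational load. In this paper, we address the challenge of optimizing neural network architectures for split computing. To this end, we propose NASC, which jointly explores the optimal model architecture and a split point so as to meet latency requirements (i.e., the total latency of computation and communication is smaller than a certain threshold). NASC employs one-shot NAS, which does not require repeated model training, for a computationally efficient architecture search. Our performance evaluation using the benchmark data of hardware (HW)-NAS-Bench shows that the proposed NASC can improve the trade-off between communication latency and model accuracy, i.e., it reduces latency by approximately 40-60% from the baseline with slight accuracy degradation.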
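
The latency requirement described in this abstract (total computation plus communication latency below a threshold) can be illustrated with a small sketch. It covers only the split-point side of the problem under a simple additive latency model; the architecture search and the accuracy objective of NASC are not modeled. All names (`total_latency_ms`, `best_split`) and the per-layer numbers are hypothetical.

```python
def total_latency_ms(split, device_ms, server_ms, payload_bytes, bytes_per_ms):
    """Latency when layers [0, split) run on the device and the rest on the server.

    payload_bytes[split] is the size of the tensor sent over the network at that
    split (index 0 is the raw input, the last index is the final output).
    """
    device = sum(device_ms[:split])      # on-device computation
    server = sum(server_ms[split:])      # edge-server computation
    transfer = payload_bytes[split] / bytes_per_ms
    return device + transfer + server


def best_split(device_ms, server_ms, payload_bytes, bytes_per_ms, threshold_ms):
    """Return (split, latency) with the lowest latency under the threshold, or None."""
    candidates = []
    for split in range(len(device_ms) + 1):
        latency = total_latency_ms(split, device_ms, server_ms,
                                   payload_bytes, bytes_per_ms)
        if latency <= threshold_ms:
            candidates.append((latency, split))
    if not candidates:
        return None
    latency, split = min(candidates)
    return split, latency


if __name__ == "__main__":
    # Hypothetical three-layer model: per-layer compute times and tensor sizes.
    device_ms = [4.0, 6.0, 8.0]
    server_ms = [1.0, 1.5, 2.0]
    payload_bytes = [6000, 3000, 500, 10]
    print(best_split(device_ms, server_ms, payload_bytes,
                     bytes_per_ms=100.0, threshold_ms=20.0))
```

In this toy setting, splitting early is dominated by the transfer of a large intermediate tensor, while splitting late shifts all computation to the slower device, which is the tension that makes the split point worth searching over.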


More Practical Scenario of Open-set Object Detection: Open at Category Level and Closed at Super-category Level

Yusuke Hosoya, Masanori Suganuma, Takayuki Okatani

Categories: Computer Vision

2022-07-20

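Open-set object detection (OSOD) has recently attracted considerable attention. Its goal is to detect unknown objects while correctly detecting and classifying known objects. We first point out that the OSOD scenario considered in recent studies, which assumes an unlimited variety of unknown objects as in open-set recognition (OSR), has a fundamental issue: for such an unlimited set of unknown objects, we cannot determine what should and should not be detected, which a detection task requires. This issue makes it difficult to evaluate the performance of methods for detecting unknown objects. We then introduce a novel OSOD scenario that deals only with unknown objects sharing a super-category with the known objects. It has many real-world applications, such as detecting an ever-growing number of fine-grained objects. This new setting is free from the above issue and the resulting evaluation difficulty. Moreover, it makes detecting unknown objects more realistic, owing to the visual similarity between known and unknown objects. We show through experiments that a simple method based on the uncertainty of class predictions from standard detectors outperforms the current state-of-the-art OSOD methods tested in the previous setting.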
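
One way to read "a simple method based on the uncertainty of class predictions" is sketched below: a detection is labeled unknown when the entropy of its class distribution exceeds a threshold. The choice of entropy as the uncertainty measure, the threshold value, and the names `label_detection` and `max_entropy` are assumptions made for illustration; the abstract does not say which uncertainty measure the authors use.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def label_detection(logits, class_names, max_entropy):
    """Return the predicted known class, or "unknown" if the prediction is too uncertain."""
    probs = softmax(logits)
    if entropy(probs) > max_entropy:
        return "unknown"
    return class_names[probs.index(max(probs))]


if __name__ == "__main__":
    # Hypothetical fine-grained classes within one super-category (birds).
    names = ["sparrow", "finch", "robin"]
    print(label_detection([8.0, 0.0, 0.0], names, max_entropy=0.5))
    print(label_detection([1.0, 1.0, 1.0], names, max_entropy=0.5))
```

A confident prediction has near-zero entropy and keeps its class label, while a flat distribution over the known classes is treated as a sign that the object belongs to an unseen class of the same super-category.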


GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features

Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani

Categories: Computer Vision | Artificial Intelligence | Natural Language Processing

2022-07-20

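Current state-of-the-art methods for image captioning employ region-based features because they provide object-level information that is essential for describing the content of images; these features are usually extracted by an object detector such as Faster R-CNN. However, they have several issues, such as a lack of contextual information, the risk of inaccurate detection, and high computational cost. The first two can be resolved by additionally using grid-based features. However, how to extract and fuse these two types of features has not been established. This paper proposes a Transformer-only neural architecture, called GRIT (Grid- and Region-based Image captioning Transformer), that effectively uses the two visual features to generate better captions. GRIT replaces the CNN-based detector used in previous methods with a DETR-based one, making it computationally faster. Moreover, its monolithic design, consisting only of Transformers, enables end-to-end training of the model. This design and the integration of the dual visual features bring significant performance improvement. Experimental results on several image captioning benchmarks show that GRIT outperforms previous methods in both inference accuracy and speed.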


Single-image Defocus Deblurring by Integration of Defocus Map Prediction Tracing the Inverse Problem Computation

Qian Ye, Masanori Suganuma, Takayuki Okatani

Categories: Computer Vision

2022-07-07

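In this paper, we consider the problem of defocus image deblurring. Previous classical methods follow a two-step approach: defocus map estimation first, followed by non-blind deblurring. In the deep learning era, some researchers have tried to address these two problems with CNNs. However, simply concatenating the defocus map, which represents the blur level, leads to suboptimal performance. Considering the spatially varying nature of defocus blur and the blur level indicated in the defocus map, we use the defocus map as conditional guidance to adjust the features of the input blurry image, instead of simple concatenation. We then propose a simple but effective network with spatial modulation based on the defocus map. To achieve this, we design a network consisting of three sub-networks: a defocus map estimation network, a condition network that encodes the defocus map into condition features, and a defocus deblurring network that performs spatially dynamic modulation based on the condition features. The spatially dynamic modulation is based on an affine transform function that adjusts the features of the input blurry image. Experimental results show that our method achieves better quantitative and qualitative performance than existing state-of-the-art methods on commonly used public test datasets.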
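
The spatially dynamic modulation described in this abstract is an affine transform of the image features. A minimal sketch, assuming a single-channel feature map and per-pixel scale and shift maps (called `gamma` and `beta` here, names not taken from the abstract), looks like this. In the paper's setting these maps would come from the condition network; in the sketch they are given by hand.

```python
def modulate(features, gamma, beta):
    """Per-pixel affine modulation: out[i][j] = gamma[i][j] * features[i][j] + beta[i][j].

    features, gamma, and beta are 2D lists of the same shape. gamma and beta
    stand in for condition features derived from the defocus map (assumed).
    """
    return [
        [g * f + b for f, g, b in zip(f_row, g_row, b_row)]
        for f_row, g_row, b_row in zip(features, gamma, beta)
    ]


if __name__ == "__main__":
    features = [[1.0, 2.0], [3.0, 4.0]]
    gamma = [[1.0, 0.5], [2.0, 0.0]]    # per-pixel scale
    beta = [[0.0, 1.0], [-1.0, 3.0]]    # per-pixel shift
    print(modulate(features, gamma, beta))
```

Unlike concatenation, which leaves it to later layers to decide how to use the defocus map, the affine form lets the blur level at each pixel directly rescale and shift the features at that location.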


Learning Regularized Multi-Scale Feature Flow for High Dynamic Range Imaging

Qian Ye, Masanori Suganuma, Jun Xiao, Takayuki Okatani

Categories: Computer Vision

2022-07-06

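Reconstructing a ghosting-free high dynamic range (HDR) image from a set of multi-exposure images is a challenging task, especially in the presence of large object motion and occlusions, where existing methods produce visible artifacts. To address this problem, we propose a deep network that learns multi-scale feature flow guided by a regularized loss. It first extracts multi-scale features and then aligns the features of the non-reference images. After alignment, we use residual channel attention blocks to merge the features from the different images. Extensive qualitative and quantitative comparisons show that our method achieves state-of-the-art performance and produces excellent results, with color artifacts and geometric distortions significantly reduced.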
